The mid-domain effect in flowering phenology

The timing of flowering is an important driver of species distribution and community assembly patterns. However, we still have much to learn about the factors that shape flowering diversity (i.e., number of species flowering per period) in plant communities. One potential explanation of flowering diversity is the mid-domain effect, which states that geometric constraints on species ranges within a bounded domain (space or time) will yield a mid-domain peak in diversity regardless of ecological factors. Here, we determine whether the mid-domain effect explains peak flowering time (i.e., when most species of communities are flowering) across China. We used phenological data of 16,267 herbaceous and woody species from the provincial Flora in China and species distribution data from the Chinese Vascular Plant Distribution Database to determine relationships between the observed number of species flowering and the number of species flowering as predicted by the mid-domain effect model, as well as between three climatic variables (mean minimum monthly temperature, mean monthly precipitation, and mean monthly sunshine duration). We found that the mid-domain effect explained a significant proportion of the temporal variation in flowering diversity across all species in China. Further, the mid-domain effect explained a greater proportion of variance in flowering diversity at higher latitudes than at lower latitudes. The patterns of flowering diversity for both herbaceous and woody species were related to both the mid-domain effect and environmental variables. Our findings indicate that including geometric constraints in conjunction with abiotic and biotic predictors will improve predictions of flowering diversity patterns.


Introduction
The seasonal timing of flowering underpins several aspects of community assembly and ecosystem function (Post et al., 2001;Cleland et al., 2006).Although some plant species flower throughout the growing season (Fenner, 1998), in most plant communities, flowering of most plant species overlaps at a specific point in time, which is referred to as the peak flowering time.Peak flowering times are important because their patterns often involve the co-evolved behaviors of plants and pollinators, which may contribute to community assembly in ways involving species diversity and dominance patterns (Heithaus, 1974;Rathcke and Lacey, 1985).Thus, shifts in peak flowering times have broad implications for ecological interactions, including attraction of mutualists, avoidance of antagonists, and planteplant competition (via changes in seedling density and emergence times), which could thus reshape plant communities (Wright, 1996;Thies and Kalko, 2004;Pau et al., 2011;CaraDonna et al., 2014).Although we know flowering times of individual species are affected by environmental factors such as temperature, precipitation, and irradiance (Frankie et al., 1974;Reich and Borchert, 1984;Wright and Vanschaik, 1994), we still have much to learn about the forces that shape peak flowering time at the community scale.
The mid-domain hypothesis, which was introduced to help explain latitudinal gradients in species diversity (Colwell and Hurtt, 1994;Colwell and Lees, 2000), predicts that if ranges of species are randomly placed within a bounded domain, the greatest overlap of ranges, and thus species richness, will occur at the center of the domain (Colwell and Hurtt, 1994).This mid-domain effect has been used to explain latitudinal and elevational gradients of species richness in both plants (Colwell et al., 2004) and animals (Chen et al., 2020;Chettri and Acharya, 2020;Siqueira et al., 2021).Mid-domain effects have also been proposed for temporal gradients, whereby randomly distributed flowering periods will tend to overlap most frequently at the mid-domain, which is in this case during the middle of the growing season (Morales et al., 2005;Letten et al., 2013).If flowering time is neutral within a growing season and is uninfluenced by species interactions and environmental conditions, then a null model such as the mid-domain hypothesis may accurately predict patterns of flowering phenology at the community level.The premise of this mid-domain effect in flowering phenology is that the onset of the growing period constrains the timing of flowering in most species, with its end setting a hard boundary for fruiting (and indirectly, flowering) because seed production must be completed before plant growth ceases (Morin and Chuine, 2006).One of the few existing studies that has tested the mid-domain effect in flowering diversity (number of species flowering per period) found that the "mid-domain" null model qualitatively matched the pattern of flowering species richness in a sub-alpine forb community in Colorado, USA (Morales et al., 2005).
Previous studies have indicated that phenological diversity (i.e., the number of species flowering per month) may be affected by latitude and growth habit.At higher latitudes, plant assemblages such as temperate forests are thought to be formed predominantly by deterministic processes (Chase and Myers, 2011).For these plants, flowering time is driven by the avoidance of harsh environmental conditions.When this occurs, the mid-domain flowering patterns may deviate from null model expectations.At lower latitudes, where environmental conditions are mild, plant assemblages (e.g., tropical and subtropical forests) are formed by stochastic or neutral processes (Hubbell, 2001;Legendre et al., 2009).Under these conditions the null model (i.e. the middomain hypothesis) may predict patterns of flowering phenology (Pau et al., 2011;Craine et al., 2012).However, subtropical regions (e.g., South China) can experience large seasonal variations in temperature, with short-lived cold surges that restrict flowering for most species (Dudgeon and Corlett, 1994).Thus, it is important to examine whether the contribution of the mid-domain effect to patterns of phenological diversity varies with latitude.
Flowering in both woody species and herbaceous species is thought to be controlled by a few key environmental cues.Most temperate woody plants flower in response to temperature (Weiser, 1970;Du et al., 2020), although the flowering of woody species, especially in seasonal tropical forests, is often induced by rainfall (Rathcke and Lacey, 1985).The onset of flowering in woody species may also be influenced by other factors such as solar irradiance (van Schaik et al., 1993).Phenological periods of woody species may converge on the same few months with favorable environmental conditions despite the potential costs of competition (Dunn et al., 2007).For many herbs, flowering has been shown to be affected by cold temperature (e.g., Borchert, 1983;Inouye, 2008;Panchen et al., 2012;Matthews and Mazer, 2016;Augspurger and Salk, 2017;Ehrlen and Valdes, 2020).For example, shoot elongation and flowering of herbs can occur very quickly after snow-release or temperature increase (Billings and Mooney, 1968).
In this study, we tested whether peak flowering time is at least partially explained by the mid-domain effect.For this purpose, we used phenological data of 16,267 herbaceous and woody species from the provincial Flora in China and species distribution data from the Chinese Vascular Plant Distribution Database to explore the relationships between observed number of species flowering and the number of flowering species as predicted by the middomain effect, as well as three climatic variables (mean minimum monthly temperature, mean monthly precipitation, and mean monthly sunshine duration).We hypothesized that (1) the middomain effect partially explains variation in peak flowering time of plants at the regional scale; (2) the mid-domain effect is more pronounced in flowering patterns of species from lower latitudes than from those at high latitudes; (3) the flowering patterns of both herbaceous and woody species are correlated with both the middomain effect and environmental variables.

Species distribution and phenology data
Species distribution data was obtained from the Chinese Vascular Plant Distribution Database (Xu et al., 2018), which was compiled from over six million specimens and more than 1000 published floras, checklists, and inventory reports.All records in this database have been georeferenced to the county level, and there are 30,811 species with distribution data.
We studied the mid-domain effect at the provincial level.Flowering phenology data was available from local Flora from 27 provinces; county level phenology data does not exist at present.For each province, we scored a species as being present if it was found in any county in the province.Three pieces of information were compiled: the earliest month and the latest of flowering and the growth form (woody or herbaceous).We calculated the midpoint of the flowering period listed in the Flora.Although this estimate of flowering time is somewhat biased because rightskewed flowering curves are common, the midpoint has frequently been used in studies of flowering phenology across communities or floras when precise phenological data is unavailable (e.g., Kochmer and Handel, 1986;Johnson, 1993;Wright and Calderon, 1995;Warren et al., 2011;Du et al., 2020).Species with incomplete flowering information were excluded from the study.There were 16,267 species with flowering data, representing 2218 genera and 251 families (5721 woody and 10,546 herbaceous species).
We did not test the mid-domain effect at the whole-country level.This is because in a country level analysis, a species' flowering date accounts for flowering dates across all the provinces where the species is distributed (meaning widely-distributed species that flower at different times in high and low latitude provinces will have wide flowering ranges).This issue does not apply to provincelevel analyses, where province-level flowering data can be appropriately used.

Climatic data
Flowering in plant species has been shown to be regulated by three key environmental variables: temperature, precipitation, and solar irradiance (Frankie et al., 1974;Reich and Borchert, 1984;Wright and van Schaik, 1994).To examine the influence of climate on the number of species flowering each month, we extracted data from 2015 to 2020 on three environmental variables from the China Meteorological Data Service Centre (http://data.cma.cn):mean minimum monthly temperature ( C, T min ), mean monthly precipitation (mm, MMP), and mean monthly sunshine duration (h, Sunshine).Minimum temperature, rather than mean temperature, is usually thought to determine the species distribution and flowering phenology (Augspurger and Salk, 2017;Ehrlen and Valdes, 2020).Sunshine acts as an index of irradiance and is defined as the sum of all time periods during the day when the direct solar irradiance equals or exceeds 120 W/m 2 .

Data analyses
The mid-domain effect null model was used to test the influence of geometric constraints (in the complete absence of any supposition of environmental gradients within the domain; for details, see Colwell and Lees, 2000) on the temporal patterns of species richness along a monthly gradient.In addition, the null model was constrained by the empirical ranges of the flowering periods in our study.We explored whether the null model of a phenological mid-domain effect could qualitatively match the observed mid-season increase in species flowering.We first tested the mid-domain effect for each of the 27 provinces.We used the random placement model (model 4 in Colwell's RangeModel; see Colwell and Lees, 2000) to generate a predicted species richness pattern under geometric constraints.A flowering season (the domain) of n months was considered.To use a mid-domain model to investigate the temporal patterns of flowering species richness, we reduced the occurrence of each species' observed flowering period to two parametersdits midpoint and its duration.The calendar year was divided into 12 month-long segments.We used different flowering domains for different provinces.For example, in Heilongjiang Province, we set April as the start season and October as the end season, according to the flowering duration of all species from this province.To apply mid-domain models to month segments, observed flowering ranges of species were randomly assigned to the domain.This model randomly chose ranges from the empirical frequency distributions of range sizes and placed them within the domain at random.This practice thus eliminated bias due to the difference between theoretical frequency distributions of range sizes and empirical ones.We then "reshuffled" the observed ranges in the domain by resampling with replacement so that each range was assigned a new random midpoint based on possible midpoints imposed by the range of the flowering period (Morales et al., 2005).We implemented the simulation program in RangeModel v.5 (Colwell, 2004).We conducted 5000 simulations of random range placement within the bounded domain.The computed mean richness and its 95% confidence interval was used to evaluate the explanatory power of the mid-domain effect for the empirical number of species flowering each month.
We first conducted generalized linear models to evaluate the relationships between the observed number of species flowering and the mid-domain effect model-predicted number of species flowering.The R 2 and P values from linear regressions of the observed number of species flowering against the predicted number of species flowering were computed to assess the fit of the middomain effect models.We also conducted linear regressions between variance explained by the mid-domain effect models as a function of latitude for all pooled species, for herbs and woody species, respectively.
Because the shape and midpoint of a modeled mid-domain effect are dependent on boundary locations, caution should be taken when fitting a model for mid-domain effect if flowering is possible all year long, as in the southernmost provinces.Here we tested the sensitivity of our conclusions to a rule-defined method of setting the boundaries.We did this by computing the dates on which 2.5% and 97.5% of cumulative flowering records had occurred in each province and used the dates as domain boundaries.Then, we repeated the computations for 5% and 95% of cumulative flowering records instead to test the sensitivity of the patterns to this method.We included the results of the 5% and 95% cumulative flowering records in the main text because the boundaries were stricter, but the results of the 2.5% and 97.5% cumulative flowering records can be found in the supplementary materials.
To examine the relative importance of the mid-domain effect model and climatic variables in shaping the phenological patterns for pooled species, for herbaceous species and for woody species, linear mixed-effects models were used to model the number of species flowering (log-transformed) as a function of mid-domain effect (MDE, log-transformed), mean minimum monthly temperature ( C, T min ), mean monthly precipitation (mm, MMP), and mean monthly sunshine duration (h, Sunshine), with a random intercept for province.The 'lmer' function in R package 'lme4' was used to create the model.We used the variance inflation factor (VIF) to assess the multicollinearity among predictors, and we found that all VIFs were less than 3, indicating that collinearity was not a problem for our multiple regression models.The individual contribution of each selected fixed-factor was determined using the 'glmm.hp'package in R, which was developed to evaluate the relative importance of predictors in linear mixed models (Lai et al., 2022(Lai et al., , 2023)).All calculations were performed using the software R v.4.3.1 (R Core Team, 2023).

Mid-domain effect in peak flowering phenology
For the dates on which 5% and 95% of cumulative flowering records occurred, the null model explained a significant proportion of the variation in the number of flowering species across the temporal gradient in all 27 provinces for all species pooled (Figs. 1  and 2a) and for herbaceous species (Table 1 and Fig. 2b).For woody species, we found a significant or marginally significant (0.05 < P < 0.10) relationship between the empirical species richness and the predicted richness for 14 out of 27 provinces (Table 1 and Fig. 2c).The mid-domain effect model explained between 51.9% and 94.0% of the variance in the number of flowering species for all species pooled, between 58.7% and 94.5% for herbaceous species, and between 3.5% and 76.7% for woody species (Table 1).
Analyses of 2.5% domain and 5% domain generated similar results (Appendix A).This suggests that our results were robust to this sampling strategy.

Latitudinal patterns of mid-domain effect
The 5% domain analyses showed that the variance explained by the mid-domain effect models was positively related with latitude for all species pooled (R 2 ¼ 0.37, P < 0.001) and for herbaceous species (R 2 ¼ 0.30, P ¼ 0.002; Figs. 2 and 3a and b).No significant regression was found between variance explained by the middomain effect models and latitude for woody species (Fig. 3c).The 2.5% domain analyses generated similar results (Appendix A).

Relationship between climatic variables and the flowering diversity
The observed numbers of species flowering were significantly related to the mid-domain effect model-predicted species richness and to all three climatic variables, but the relationships varied between herbs and woody species (Table 2).
For herbaceous species, the mid-domain effect had the greatest predictive power in explaining observed numbers of species flowering (50.8% of variance).Mean minimum monthly temperature and mean monthly precipitation were the second and third most important predictors, accounting for 33.5% and 12.5% of the variance, respectively.For woody species, the mid-domain effect had strong predictive power, explaining 78.7% of variance in the observed number of flowering species.Mean minimum monthly temperature accounted for 9.2% of the variance.Mean monthly precipitation and monthly sunshine duration were less informative, explaining less than 7% of variance.Similar results were found when we repeated the analyses using the 2.5% domain data (Appendix A).We found that interspecific overlap in the temporal ranges of flowering phenology within a defined temporal domain resulted in a mid-domain peak.The prevalent mid-domain 'saddle' in the panels of Fig. 1 is likely a consequence of many small and relatively equal-sized 'flowering periods' (Colwell et al., 2009).Our results suggest that the null model of a phenological mid-domain effect explains a large percentage of the temporal variation in flowering diversity.Thus, future studies should use mid-domain null models to examine factors that drive seasonal patterns of flowering diversity.
This mid-domain effect is consistent with a previous study that examined flowering phenology (Morales et al., 2005), although our study has some new findings.Firstly, while Morales et al. (2005) included only dozens of species, our study on mid-domain effect in flowering phenology included more than 16,000 species.That is, our study suggests that phenological mid-domain effects may be a general phenomenon.Secondly, Morales et al. (2005) only included forbs, whereas we found that the mid-domain effect may explain the flowering patterns of both herbs and woody species.Thirdly, Morales et al. (2005) study was conducted in sub-alpine communities in a temperate region, whereas our study covered temperate, subtropical, and north tropical regions.In short, our findings indicate that the mid-domain effect can explain patterns of flowering diversity in various ecosystems.
Perhaps the most novel finding of our study is that the middomain effect model explained a greater proportion of the variance in flowering patterns at higher latitudes than at lower latitudes, which is not what we predicted.However, this latitudinal pattern was only observed for herbaceous species.In other words, in regions with distinct flowering seasons, more species tend to flower near the temporal boundaries of the flowering season than near the middle (Colwell and Lees, 2000).Our results suggest that geometric constraints should be largely considered as drivers of patterns of flowering diversity in temperate ecosystems.Consequently, previous suggestions that plant assemblages are dominated by neutral processes at lower latitudes (like tropical rainforests) and by deterministic processes at higher latitudes (Hubbell, 2001;Rangel and Diniz-Filho, 2005;Clark, 2009) may be only suitable for spatial niches and not for temporal niches.Our findings suggest that plant flowering phenology in lower latitudes may be more influenced by species interactions and by environmental conditions.Another potential explanation for the better fit of the middomain model at high latitudes is that the shorter growing season found at higher latitudes means that the effective domain (in this case, the number of months over which species might flower) is smaller at higher latitudes, which may have resulted in a more strongly peaked mid-domain effect (Colwell and Lees, 2000) and potentially caused a greater match between the predictions of our null model and the observed data.
No latitudinal patterns were found for woody species, possibly because environmental factors, and especially biotic factors, influence flowering phenology strongly at both higher and lower latitudes.This theory is supported by previous findings that flowering patterns in both temperate and tropical forests are driven by climatic variables such as temperature, precipitation and photoperiod (Borchert et al., 2005;Stevenson et al., 2008;Pau et al., 2011;Flynn and Wolkovich, 2018).Compared to herbaceous species, the reproductive phenology of woody species may be more determined by biotic interactions, including both selection to coincide with the seasonality of mutualists (e.g., pollinators and seed dispersers) and selection to avoid the seasonality of pests (e.g., herbivores and seed predators) (Stiles, 1977;Brody, 1997).For example, woody species, and not herbs, exhibit mast flowering and fruiting in both temperate and tropical forests, with synchronous, species-wide, mass productions of flowers and seeds (Appanah, 1993;Wesolowski et al., 2015).The evolutionary mechanisms at work in masting include seed predator satiation (Janzen, 1971) and increased efficiency in pollination or fruit dispersal (Norton and Kelly, 1988;Kelly, 1994).Because seed size is usually small for herbs, herbaceous species are hard-pressed to satiate seed predators.Our study shows that only by considering all possible abiotic and biotic variables that affect the temporal patterns of flowering phenology may we gain a complete understanding of the evolution of flowering phenology.

Relationship between climatic variables and flowering diversity
To our knowledge, our study is the first to put climatic variables and mid-domain effect into one model to assess the relative importance of mid-domain effect and climatic factors in driving plant phenology patterns.We found that flowering diversity of herbaceous plant assemblages is most strongly related with the mid-domain effect, mean minimum monthly temperature, and mean monthly precipitation.This suggests that herbaceous plants may begin to flower earlier as the climate warms.This is supported by the finding that the phenology of herbs is driven by minimum temperature (e.g., Inouye, 2008;Matthews and Mazer, 2016;Augspurger and Salk, 2017).The flowering diversity of woody plant assemblages is primarily related to the mid-domain effect and temperature.In contrast to herbaceous plants, flower development in many trees is not continuous from flower induction to anthesis but may become temporarily arrested at some intermediate stage (Borchert, 1983).Final development of flower buds and anthesis of woody species occurs many months after flower initiation, and during this period, flowering may need other stimuli such as warm Table 2 Parameters from linear mixed-effects models (lmer) modeling the number of species flowering (log-transformed) as a function of mid-domain effect (MDE), mean minimum monthly temperature ( C, T min ), mean monthly precipitation (mm, MMP), and mean monthly sunshine duration (h, Sunshine), with a random intercept for province, based on dates for which 5% and 95% of cumulative flowering records occurred.'I.perc' column is the individual contribution percentage in the glmm.hpfunction.

All species
Herbaceous temperatures (van Schaik et al., 1993).One of the main findings of our study is that the mid-domain effect is the most important variable for explaining the phenological patterns of both herbaceous and woody species.
Our finding that the flowering diversities of woody and herbaceous plant assemblages are determined by different drivers suggests that any climate-driven and/or random reassembly will likely follow distinct trajectories (Ackerly, 2003).This might have major consequences for community reshuffling under climate change.We argue that the mid-domain effect must be considered, in addition to abiotic and biotic factors, when assessing the patterns of phenology in the future.Specifically, we should at least combine the mid-domain effect null model and environmental variables into a single model to investigate the relative importance of each variable on the number of species flowering each period.We believe that future studies should explore whether warming temperature is more important than the mid-domain effect under climate change scenarios and whether the relative importance of the middomain effect in explaining flowering diversity changes for herbaceous versus woody species in temperate, subtropical, and tropical forests.

Limitations of the study
Three aspects of our study require caution when interpreting the results.Firstly, assessing the contribution of the mid-domain effect to empirical patterns requires careful placement of domain boundaries because the shape and midpoint of a modeled the middomain effect are highly dependent on boundary locations.This problem is intrinsically more challenging for annual phenological datasets, which are cyclic rather than intrinsically bounded, unlike spatial domains.In this study, we tested the sensitivity of our conclusions to a rule-defined method of setting the boundaries by computing domain boundaries for each province based on the dates on which 2.5% and 97.5% of cumulative flowering records have occurred.Then, we tried boundaries based on 5% and 95% instead to test the sensitivity of the patterns to this method.We found that both analyses generated similar results, even as the domains changed, suggesting that our results were robust using this sampling strategy.
Secondly, compared to research by Morales et al. (2005), our study had larger time increments because we used monthly phenology data.It would have been more accurate to use more precise phenology data for each species in each province, but this data does not exist at present.We suggest that future studies use observer-based phenology data to test the effects of the middomain effect on flowering diversity along a latitudinal gradient.
In addition, we should be careful about the collinearity of the mid-domain effect with environmental variables (Letten et al., 2013).The mid-domain effects expected to be related to both abiotic parameters and biotic components, such as species interactions, which means that a species range is dependent on both the environment and other species.Therefore, caution is required when interpreting our results, although many previous studies also incorporated the mid-domain effect and environmental variables into multiple linear regression models to examine which variable contributed the most to explaining species richness gradients (e.g., ChangBae et al., 2013;Feng et al., 2017;Sun et al., 2020).In our study, we calculated the variance inflation factor to assess the multicollinearity among predictors, and we found that VIF was low, which suggests that collinearity was not a major problem.We also used the newly developed 'glmm.hp'R package to quantify the importance of each predictor variable, therefore facilitating our interpretation of the importance of each predictor, particularly when these variables were correlated (Lai et al., 2022).Future work should improve methodological approaches to account for the collinearity of the mid-domain effect with environmental variables.

Conclusions
Using the largest flowering phenology dataset to date, we provide independent confirmation that mid-domain models qualitatively matched patterns of flowering species richness.The middomain effect was among the most important variables in explaining the flowering patterns for both herbaceous and woody species, but it has been neglected in previous studies.Identifying the relative importance of mid-domain effects and abiotic and biotic processes on phenological patterns presents an interesting challenge that deserves further attention.Our results suggest that geometric constraints are of sufficient magnitude to warrant further consideration as drivers of flowering seasonality for temperate, subtropical and tropical forests.Our findings also indicate that other phenological patterns, such as leaf flush and fruit ripening times, might display mid-domain effects as well.These possibilities would be worthwhile directions to pursue in future research.

Fig. 1 .
Fig. 1.Observed and null model-predicted flowering diversity for 27 provinces in China, based on dates for which 5% and 95% of cumulative flowering records occurred.The fitted values (solid lines) and the 95% confidence interval predicted by the mid-domain models (gray bands) are shown in the plots.The R 2 and P-values reflect the relationship between the observed data and the predicted values from the mid-domain model.

Fig. 2 .
Fig. 2. The geographic patterns of the variance explained by the mid-domain effect models in China estimated at the provincial level for (a) all species pooled, (b) herbs and (c) woody species based on dates for which 5% and 95% of cumulative flowering records occurred.

Fig. 3 .
Fig. 3.The relationship between the variance explained by the mid-domain effect models and latitude for all species (left panel), herbaceous species (middle panel) and woody species (right panel), based on dates for which 5% and 95% of cumulative flowering records occurred.Each circle represents a single province.The red line is the fitted line from the linear model.

Table 1
Summary of the generalized linear models of the relationship between the observed number of species flowering and the mid-domain effect model-predicted number of species flowering, based on dates for which 5% and 95% of cumulative flowering records occurred.